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Abstract. We examine the problem of predicting the evolution of solutions of the Kuramoto- 
Sivashinsky equation when initial data are missing. We use the optimal prediction method to con- 
struct equations for the reduced system. The resulting equations for the resolved components of the 
solution are random integrodifferential equations. The accuracy of the predictions depends on the 
type of projection used in the integral term of the optimal prediction equations and on the choice of 
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1. Introduction. Computer applications to science and engineering continue 
to expand. However, there is still a wealth of problems where the computational 
power we can bring to bear is not sufficient and thus the calculations we perform are 
underresolved. This situation leads to a twofold question: What quantities should 
one strive to predict and what are the equations that describe the evolution of these 
quantities? 

The purpose of this work is to examine the problem of prediction in the presence 
of underresolution for the Kuramoto-Sivashinsky equation which models pattern for- 
mation in different physical contexts |^ |^ . We will be using the equation in the 
form 

Vt + vvx + v^x + vvxx.j;x = 0, (1.1) 

in the domain [0, 27r] with periodic boundary conditions and initial condition Vq{x). 
The parameter v has the conventional interpretation of viscosity and controls the rate 
of energy dissipation at the small scales. Depending on the value of the parameter 
V the solutions can exhibit a wide range of behavior, from steady states, to periodic 
solutions in time, to solutions with coherent spatial structures that evolve chaotically 
in time (such solutions will be called chaotic), to solutions that exhibit temporal 
chaoticity together with long range spatial statistical independence (spatio-temporal 
chaos) |H ll2l[T??) . The problem of modelling the behavior of the Kuramoto-Sivashinsky 
equation using a reduced set of variables has proved to be a formidable one and 
different tools have been used towards this goal i40i (SSJ ISH] • 

We are addressing the problem of underresolved computations through the method 
of optimal prediction ^] . In order to proceed in the formulation of the equations 
for the quantities that will be predicted it is necessary to have some knowledge about 
the degrees of freedom that are unresolved. In optimal prediction what is assumed 
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as known is an initial measure on the space of solutions. What is sought is a mean 
solution with respect to this initial measure, compatible with the partial information 
available initially as well as with the limitations on the available computing power. 
This mean solution is the conditional expectation of the solution given the partial 
initial data, and is the best available estimate of the solution of the full problem. The 
optimal prediction method consists of ways to construct equations that will allow the 
calculation of the conditional expectation of the solution. 

We are looking at a Galerkin Fourier truncation of the Kuramoto-Sivashinsky 
equation, thus, the solution of the equation is determined through the integration 
in time of a system of ordinary differential equations for the Fourier modes. The 
resolved variables will be a set of Fourier modes, smaller than what is needed for 
a full description of the system. The goal is to produce estimates for the evolution 
of the conditional expectations of the resolved Fourier modes conditioned on their 
prescribed initial values. 

The first step towards this goal is to compute the true conditional expectations 
(what will be later called the truth) . To do that requires the knowledge of a measure 
on the space of all the Fourier modes (resolved and unresolved). The Kuramoto- 
Sivashinsky equation has no natural candidate for this measure. We evolve repeat- 
edly, using initial conditions drawn from a uniform distribution, the equations needed 
for the full description of the system and collect the values of all the Fourier modes 
after some prescribed time interval. In this way we obtain a collection of samples of 
the values of the Fourier modes. We, then, use maximum likelihood estimation to 
construct an approximation to the measure that describes the distribution of these 
samples. We construct a diagonal Gaussian approximation to the measure (the rea- 
son for the use of a diagonal measure is given later). The reason that we choose a 
Gaussian measure is that, inspection of the first few moments of the collection of 
samples, reveals that some of the modes have Gaussian statistics. Our future plan 
is to construct more elaborate approximations to the measure (e.g. mixtures of di- 
agonal Gaussian measures using the Expectation-Maximization algorithm [2H|)- The 
measure constructed is sampled, keeping the resolved Fourier modes fixed to their 
prescribed values. We use the samples as initial conditions to integrate the equations 
for all the Fourier modes. The true conditional expectations of the resolved modes 
given their initial values, are the averages of the resolved modes over the samples. 

The second step is to produce estimates for these conditional expectations. To 
do that we need to construct equations for the evolution of the resolved modes. This 
is achieved through the use of the optimal prediction formalism that utilizes the 
measure constructed through maximum likelihood estimation. We use the optimal 
prediction formalism in a certain limit called the short-memory approximation. This 
approximation is valid if the resolved modes depend only on their recent history. The 
short-memory approximation is more general than the delta-function approximation 
used frequently in the statistical physics literature, where the resolved modes' evolu- 
tion is assumed to depend only on their present values. In fact, the delta- function 
approximation can be derived as a limiting case of the short-memory approximation. 

The optimal prediction equations for the resolved modes, in the short-memory 
approximation, are random ordinary integrodifferential equations. They contain a 
Markovian term, depending only on the resolved modes' current values, a memory 
term that depends on the recent values of the resolved modes and a random term 
that depends on the unresolved modes. The optimal prediction equations are solved 
repeatedly for different realizations of their random part. The initial conditions for 
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the resolved modes are the same for all realizations. The estimates of the conditional 

expectations of the resolved modes given their initial values arc the averages, over 
the different realizations, of the values of these modes. We compare the estimates 
produced in this way to the true conditional expectations. The results depend on the 
set of variables that are resolved and on the type of projection used in the memory 
term. If the set of resolved variables includes all the linearly unstable modes and the 
projection of the memory term is on the span of the resolved modes (linear projec- 
tion), the agreement between the estimates of the conditional expectations and the 
true conditional expectations is good for relatively long times. If the set of resolved 
variables includes all the linearly unstable modes and the projection of the memory 
term is on an orthonormal set of functions of the resolved modes (finite-rank pro- 
jection), the agreement between the estimates of the conditional expectations and 
the true conditional expectations is good for short times only. If, even one linearly 
unstable mode is left unresolved, the agreement between the estimates and the true 
conditional expectations is good for short times only. 

2. Optimal prediction formalism and the short-memory approxima- 
tion. We present the optimal prediction formalism in full generality and derive from 
it the approximation that we will be using later, namely the short-memory approxi- 
mation. 

2.1. Conditional expectations and the Mori-Zwanzig formalism. Sup- 
pose we are given a system of ordinary differential equations 

with initial condition ^(0) = x, and we know only a fraction of the initial data, say 

X, where x = {x, x) and corrcspondigly (f> = {(j), (j)) and that the imresolved data are 
drawn from a measure with density f{x) (we will be working only with measures that 
are smooth enough to have a density). 

Suppose u, V are functions of x, and introduce the scalar product (u, v) = E[uv] = 
J u{x)v{x)f(x)dx. We will denote the space of functions u with E[u^\ < oo by i2(/) 
or simply L2- We are looking for approximations of functions of x by functions of 
X, where x are the variables that form our reduced system (the resolved degrees of 
freedom). The functions of x form a closed linear subspace of L2, which we denote 
by £2- Given a function urn L2, its conditional expectation with respect to x is 

} fdx 

The conditional expectation ii;[u|a;] has the following properties: 

• £'[u|:c] is a function of a;, 

• E[au + hv\x\ = aE[u\x\ + hE[v\x\, 

• E[u\x\ is the best approximation of u by a function of x: 

E[\u - E[u\xf] < E[\u - h{x)f] 

for all functions h. 

We can approximate the conditional expectation by picking a basis in L2, 

for example hi{x),h2{x), .... For simplicity assume that the basis functions hi{x) are 
orthonormal, i.e., E[hihj] = 5ij. The approximation of the conditional expectation 
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can be written as = '^ajhj{x), where aj — E[uhj\ = E[u{x,x)hj{x)\. If 

we have a finite number of terms only, we are projecting on a smaller subspace and 
the projection is called a finite-rank projection. In the special case where we pick as 
basis functions in L2 the functions hi{x) = cci, /i2(i) = 2^2, ■ ■ • , hm{x) = Xm, then the 
corresponding finite-rank projection is called in physics the "linear" projection (note 
that all projections are linear, so "linear" is used to denote that the projection is on 
linear functions of the resolved variables). We should note here that it is not always 
true that E[xiXj] = for i =/= j. 

The system of ordinary differential equations we are asked to solve can be trans- 
formed into the linear partial differential equation fSI 

ut = Lu, u{x,0) = g{x) (2-1) 

where L = Ri{x)-^ is the Liouvillian and the solution of l|2.1|l is given by u{x, t) = 
g{(j){x,t)). Consider the following initial condition for the PDE 

g{x) — Xj =^ u{x,t) ~ (j)j{x,t) 

We can rewrite the solution symbolically as 

u{x, t) = e^^Xj 

This implies that for the general case where u{x, 0) = g{x), the solution is 

u{x,t) = e*^5(x) = g(e*^x). 
Using the symbolic representation we can rewrite (|2.1(l as 

at 

and furthermore by using the identity Le^^ = e^^L (12.11) becomes 

^e*^a;, = e*^Lx,. (2.2) 

If P is any of the projections mentioned before and Q — I — P, (|2.2|) can be rewritten 
as H] 



d 



e*^Xj = e^^PLxj + e^Q^QLxj + [ e^^^"^^ PLe'^^^QLxjds, (2.3) 



where we have used Dyson's formula 



e 



tL ^ ^tQL ^ / e(*-")^PLe''3^ds. (2.4) 



Equation H2.3() is the Mori-Zwanzig identity |41[ I30L 1^ . Note that this relation is 
exact and is an alternative way of writing the original PDE. It is the starting point 
of our approximations. Of course, we have one such equation for each of the resolved 
variables (fij , j = 1, . . . ,m. The first term in H2.3|l is usually called Markovian since it 
depends only on the values of the variables at the current instant, the second is called 
"noise" and the third "memory". The meaning of the different terms appearing in 
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H2.3|l and a connection (and generalization) to the fluctuation-dissipation theorems of 
irreversible statistical mechanics can be found in [Sj . 
If we write 

e*^^QLxj = Wj, 

Wj{x,t) satisfies the equation 



j ^Wj{x,t) = QLwj{x,t) 

\ Wj {x, 0) = QLxj ^ Rj (x) - m-j (x). 

If we project (|2.5|l using any of the projections discussed we get 



d 

P—Wj{x,t) = PQLwj{x,t) = 0, 



(2.5) 



since PQ = 0. Also for the initial condition 

Pwj{x,0) = PQLxj = 

by the same argument. Thus, the solution of (|2.5() is at all times orthogonal to the 
space of functions of x. We call (|2.5|l the orthogonal dynamics equation. 

2.2. The short-time and short-memory approximations. The approxi- 
mation we will examine is a short-time approximation and consists of dropping the 
integral term in Dyson's formula (|2.4|l 



gtQL ^ gtL_ (2.6) 

In other words we replace the flow in the orthogonal complement of L2 with the flow 
induced by the full system operator L. Some algebra shows that 

Q(e-Qi_e^^) = 0(s2)^ (2.7) 

and 



f e^*-'''>^PLe''^^QLxjds = [ e^*""^^ PLQe'^QLxjds + 0{t^) 
Jo Jo 



(2.8) 



As expected dropping the integral term in Dyson's formula yields an approxi- 
mation that is good only for short times. However, under certain conditions this 
approximation can become valid for longer times. To see that consider the case where 
P is the finite-rank projection so 

I 

PLQe/'^^QLx, = ^(LQe^'^ig^^;, /ifc)/ifc(x), (2.9) 
and for the approximation 

I 

PLQe'^QLxj = Y^{LQe'^QLxj,hk)hk{x). (2.10) 

k=l 
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The quantities {LQe'^^QLxj, h/.) can be calculated from the fuU system without 
recourse to the orthogonal dynamics. Recall H2.7|l which states that the error in 
approximating e"^^ by e^^ is small for small s. This means that for short times we 
can infer the behavior of the quantity {LQe'^'^'^QLxj, hk) by examining the behavior 
of the quantity {LQe'^^QLxj,hk)- 

If the quantities [LQe^'^QLxj^hk) decay fast we can infer that the quantities 
{LQe'^'^^QLxj^hk) decay fast for short times. We cannot infer anything about the 
behavior of {LQe''^'"QLxj,hk) for larger times. However, if {LQe^''^^QLxj,hk) not 
only decay fast initially, but, also, stay small for larger times, then we expect our 
approximation to be valid for larger times. To see this consider again the integral 
term in the Mori-Zwanzig equation. We see that the integral does not extend from 
to t but only from t — to to t, where to is the time of decay of the quantities 
{LQe'Q^QLxj,hk). This means that our approximation becomes 

e(*'')^PLe'^^QLxjds ^ / e^^-'^^ PLQe'^'^^QLxjds 

Jt~t„ 

= f e''*~'''^^PLQe''^QLxjds+ f 0{s^)ds 

Jt-ta Jt-ta 
t 

e^*-'^^PLQe''^QLxjds + O{t^to). (2.11) 

--to 

From this we conclude that the short-time approximation is valid for large times if 
to is small and is called the short-memory approximation. On the other hand, if to 
is large, then the error is 0{t^) and the approximation is only valid for short times. 
Note that the validity of the short-memory approximation can only be checked after 
constructing it, since it is based on an assumption about the large time behavior of the 
unknown quantities {LQe'^^^QLxj, hk). Note, that determination of the quantities 
(LQe'^^^QLxj^hk) requires the (usually very expensive) solution of the orthogonal 
dynamics equation. The short-memory approximation, when valid, allows us to avoid 
the solution of the orthogonal dynamics equation. 

If the quantities {LQe^^QLxj , hk) do not decay fast, then we can infer, again 
only for short times, that the quantities {LQe^'^^QLxj, hk) of the exact Mori-Zwanzig 
equation do not decay fast. Yet, it is possible that the quantities {LQe^^^QLxj,hk) 
start decaying very fast after short times and remain small for longer times, so that 
the short-memory approximation could still hold. Of course, this can only be checked 
a posteriori, after the simulation of the short-memory approximation equations. 

In the statistical physics literature, the assumption that the correlations vanish 
for s ^ is often made which is a special case of the short-memory approximation 
with the correlations replaced by a delta-function multiplied by the integrals. We shall 
also comment on this drastic approximation for our problem later when we present the 
numerical simulations of the short-memory approximation equations. An application 
of the short-memory approximation can be found in 1 . 

3. Density estimation. Since the Kuramoto-Sivashinsky (KS) equation does 
not have a natural candidate for the measure on the space of solutions, we have 
to construct an approximation to this measure. We do so by means of maximum 
likelihood estimation. To find the maximum likelihood estimate of the parameters of 
the measure approximation we need to obtain samples of the solution of the equation. 
This is accomplished through the numerical solution of the equation for different 
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initial conditions sampled from a uniform distribution on the space of solutions. The 
measure constructed through maximum likelihood estimation is used later to compute 
the optimal prediction equations. 

3.1. Numerical solution of the KS equation. As mentioned earlier, we are 
looking at a Fourier-Galerkin truncation of the solution. We expand the solution in 
Fourier series and retain the first N terms, 



where Uk{t) are the time-dependent Fourier coefficients. The asymmetry between 
positive and negative wavenumbers comes from the fact that we will be working with 
an even number of modes. Since the solution of KS is real, U-k = u^, where, as 
before, * denotes complex conjugation. The Fourier expansion transforms Hl.l|l into 
a system of ordinary differential equations (ODE) which reads 

f-i 

= -y 2_/ Uk'Uk-k' + {k^ - vk'^)uk- (3.1) 

k — 2" 

The KS equation conserves the mean value ^ v{x)dx — uq. In all subsequent 
calculations we assume uq = without loss of generality. Linear stability analysis 
shows that the rate of growth for the mode fc is fc^ — lyk*. This makes the first 
(where [] stands for the integer part) modes linearly unstable. The mode [(2z/)~2] is 
the most unstable. Because of the presence of the fourth order stabilizing term, the 
stable modes have decay rates that increase rapidly with increasing wavenumber thus 
making the system stiff (see |221 for an extended discussion of stiffness). The presence 
of the nonlinear coupling does not alter the stiffness of the ODE system. It is this 
property precisely that dictates the need for (i) an implicit solver that guarantees 
stability and (ii) a solver with variable step size to control the errors. 

The method we chose for our numerical experiments is a variable-step variable- 
order multistep method based on the Backward Differentiation Formulas (BDF) that 
employs the Nordsieck representation for the storing of the necessary quantities at 
each instant I14j . The BDF methods are implicit multistep methods that, up to 
order 6, are stable. Since the method of integration is implicit it results in a system of 
nonlinear algebraic equations that has to be solved iteratively and we do that by the 
Newton-Raphson method. Finally, the convolution sums in the RHS of the equations 
are evaluated in real space using an FFT transform and the 3/2 rule for dealiasing 

Because we are using an even number of modes the mode — ^ equal to zero 

to ensure that the solution to KS is a real function. 

All our calculations were performed with the value of the viscosity = 0.085. 
This value was chosen because it belongs in the first window that supports chaotic 
solutions. The first 3 Fourier modes are linearly unstable and mode fc = 2 is the most 
unstable linearly. As suggested by the numerical simulations, the energy spectrum 
shows a pronounced peak at this mode. Experiments with different number of modes 
show that for > 24 the solution is converged for the time interval that we are 
interested in. For the experiments, an error tolerance of 10^^ per step was chosen. 
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Fig. 3.1. Evolution of the energy E = J^^ v'j^{x,t)dx. 



Simulations with stricter error tolerances up to 10"^" did not show any change in 
the behavior of the solutions, so we kept the value of 10~^ which allows for faster 
calculations. Strict tolerance criteria are dictated by the well-documented extreme 
sensitivity of the KS system to small errors ^Hl El- If high accuracy is not enforced 
it is possible, for some values of the viscosity that actually support chaotic solutions, 
to observe convergence to a steady state. On Fig. (|3.1|) we show the evolution of the 
L2 norm E = J^^ vj^{x,t)dx for a typical solution at this value of the viscosity. 
We denote the L2 norm by E and call it the energy of the system. To compute the 
maximum likelihood estimate we need to obtain a number of independent samples of 
the solution. This was done by starting the simulation from different random initial 
conditions, evolving up to time t = 5 and then recording the values of the Fourier 
modes. The time t = 5 was chosen on the basis of the statistics of the collected 
samples. These statistics resemble the behavior found in the literature [HHI for long- 
time asymptotics. The random initial conditions were picked uniformly in the range 
[—1,1] for all Fourier modes. We collected 10000 independent samples of the solution, 
and it is these samples that we used in the calculation of the maximum likelihood 
estimate. 

3.2. Maximum likelihood estimation. Let Wi, W2, . . . , w„ denote n indepen- 
dent samples of a random d-dimensional vector W with probability density function 
(p.d.f.) /(w, '4') where ^ is the vector of parameters that determine the p.d.f.. Let 



y = (wi , . . . , 




If we define the probability density ^(y, as 



n 



g(y,vI/) = []/(w„*), 



then the likelihood function formed from the observed data y is defined by 



i(*)=.9(y,*). 
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The vector ^ is to be estimated by maximizing the hkehhood |23 ■ An estimate 4f of 
^ can be obtained as a solution of the hkehhood equation ^g^-* = 0, or equivalently, 

dlogL(^) _ p. 

Our initial aim was to approximate the density on the space of the Fourier modes 
by a mixture of Gaussian densities. Application of the maximum likelihood method 
to obtain ^, results in equations that cannot be solved in closed form. However, this 
problem can be tackled by the application of iterative procedures like the Expectation- 
Maximization (EM) algorithm j^Hl • But the application of the EM algorithm in such 
a high dimensional setting as the one we have is not straightforward. Two problems 
appear The first is the problem of " overfitting" , where a small cluster of points, 
or even worse, a single point, determine the properties of one of the components, thus 
making the entries of the covariance matrix of the component acquire very small values 
that, in their turn, create numerical problems with the numerical implementation of 
the algorithm (we call such a component " narrow" ) . The second problem comes from 
the fact that, if the likelihood function has multiple maxima, the EM algorithm is 
only guaranteed to converge to a local maximum that depends on the initial values 
used for the parameters. This restricts the ability of the computed mixture to be 
applied to the estimation of the density of new data points. 

The KS equation is stiff and this creates an extra difficulty because the most 
stable modes of the system have very small variances. The normalization constant of a 
Gaussian ((47r)''|A|)~^ exp[— i(w — ^)*^^^(vir — ^) for d complex variables is (47r)''|yl|, 

where |^| = niLi(47''Ai) and Xi,...Xd are the eigenvalues of A. The very small 
variances for the most stable modes result in very small values for the corresponding 
eigenvalues, which in their turn make the normalization constant acquire very small 
values. This creates, again, probelms with the numerical implementation (see |35|^. 
since we have to deal with a "narrow" component as above. But, more importantly, 
it becomes increasingly difficult to decide, if a "narrow" component appears, whether 
it stems from overfitting or it should be there to describe the KS statistics. 

Different methods have been proposed to deal with the problems of overfitting 
and convergence to only local maxima for general Gaussian mixtures jHSOlEZl- We 
shall adopt here a more modest goal and try to approximate the density with only one 
Gaussian. Of course, for the case of only one Gaussian component, we can compute 
the properties of the Gaussian by direct maximization of the likelihood function. 
However, our future goal is to use a mixture of Gaussian components to describe 
more accurately the density of the Fourier modes. 

Before we estimate the parameters of the Gaussian density we make an additional 
simplification dictated by the statistical properties of the collected samples. The 
covariance matrix for the Fourier modes based on the collected samples is 

1 " 

Cov[wi,w*] = Y ^{wkt - Mi){wkj - MjY , 

k=l 

for i,j = l,...,d where Mi,i = are the means Mi = ^X]fe=i^fei- Our 

numerical experiments indicate that the covariance matrix for the KS system is almost 
diagonal, i.e. the non-diagonal entries are very small compared to the diagonal ones 
(results from "ST also support this conclusion). We discard the off-diagonal elements 
and work with a Gaussian density with a diagonal matrix. 

The fact that the solution of the KS equation is real induces a constraint on 
the values of the Fourier modes, namely for each wavenumber k we have, U-k = u^. 
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We can incorporate the constraint for the Fourier modes in the expression for the 
Gaussian density. If we order the Fourier modes so that the first | positions of the 
vector w of modes are occupied by the modes with positive wavenumbers 1, . . . , ^ 
and the rest | positions by the modes with negative wavenumbers — 1, . . . , — |, we 
can write the Gaussian density (with the constraint) as 

exp[-i(w - Ai)*A-i(w - /i)] f[ 5{w,^. ~ <), (3.2) 

i=l 

where A = diag{ai, . . . , ad)- The entries of the matrix A have the property a^^d = Ui 
for 1 = 1,...,^, since they are the variances of the modes Ui and u^_^d which according 
to our ordering are complex conjugates of each other. Also, H^^d = //* for i = 1, . . . , |. 

d 

The normalization constant Z (modified due to the constraint) is Z — Y[i=ii'^''^^i)- 

The numerical results of the simulations for the KS equation are converged with 
iV = 24 modes. However, because the zero mode (i.e. the mean value of the real-space 
solution) is constant and taken equal to zero and the — -yth mode was set equal to 
zero as explained above, the effective number of modes is = 22. Thus the dimension 
d for the Gaussian density is 22. The formulas for the estimation of the parameters 
of the Gaussian density are 

1 " 

^j.i = -y^Wjt, (3.3) 
1 " 

An = - ^{wji - y.i){wji - Hi)*, (3.4) 

for i = l,...,d. In order to check how well the Gaussian describes the statistics 
of the collection of samples, we computed the first 4 moments (mean, covariance, 
skewness and flatness) of the samples. The Gaussian density determined by maximum 
likelihood reproduces very accurately the first three moments of the samples. The 
third moment of a Gaussian density is identically zero and thus it means that the 
distribution of a Fourier mode is symmetric around its mean value. The KS dynamics 
become evident in the fourth moment. The 4th order experimental statistics show 
a strong deviation from Gaussianity that manifests itself as negative flatness for the 
most unstable modes and positive flatness for the most stable modes. A negative 
flatness for the most unstable modes suggests the existence of large weights for values 
close to the mean value. Such large weights in the p.d.f. of the most unstable modes 
have been related (|^) to the appearance of coherent structures at these scales. On 
the other hand, a positive flatness for the most stable modes suggests the existence of 
large weights at the tails of the p.d.f.. They have been related to the appearance of 
rare bursts of activity at these scales. These rare bursts of activity are the space-time 
defects (creation and annihilation of basic cell-structures) mentioned earlier. 

The fact that a single Gaussian density cannot model accurately all the moments 
means that this density is not invariant for the equation. This will result in a complica- 
tion for the formulas of the optimal prediction equations. If the density was invariant 
and we used the finite-rank projection we could have a fluctuation-dissipation rela- 
tion |35| which would yield significantly simpler expressions for certain terms of the 
optimal prediction equations. 
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Fig. 3.2. Log-log plot of the variances for different Fourier modes as computed by 10000 Monte- 
Carlo samples and as predicted by the Gaussian density. 

The Gaussian density constructed through maximum hkehhood can be sampled 
via Monte-Carlo |2], or by producing independent samples of Gaussian distributions 
of the appropriate variance for each of the Fourier modes (e.g. using the Box-Mueller 
algorithm) [201 ■ The results can be seen in Fig. (|3.2|) . 

4. Optimal prediction for Kuramoto-Sivashinsky. We proceed to construct 
the short-memory approximation optimal prediction equations for KS. To conform 
with the optimal prediction formalism we set 



i?fc(u) 



and we get 



duk , , 



(4.1) 



for k = — — 1. The system of equations is supplemented with the initial 
condition u(0) = Uq. The initial condition can be written as Uq = (uo,Uo), where 
Uq is the vector of resolved variables and Uq the vector of unresolved ones. Recall, 
that the KS equation conserves the zero mode uq which we have set to zero for our 
numerical experiments. Also, the mode is set to zero to preserve the reality 

of the solution of KS. This makes the effective number of modes for the truncation 
equal to N — 2. This {N — 2)-dimensional vector is the vector of all the modes, 
resolved and unresolved. We set n = N — 2 to conform with the optimal prediction 
formalism. In what follows we will refer to the system of n Fourier modes as the full 
system. The vector of resolved variables will be a susbset of the n-dimensional vector 
of Fourier modes containing m modes. The unresolved modes are the rest n—m modes. 
The m-dimensional vector of resolved modes contains ^ positive wavenumber modes 
and their corresponding ^ negative wavenumber modes. Accordingly, the vector of 
unresolved modes contains positive wavenumber modes and their corresponding 
negative wavenumber modes. We do not specify which Fourier modes will be 
resolved since in our numerical experiments we use different sets of Fourier modes 



12 



P. STINIS 



as resolved. By relabeling the resolved Fourier modes we can order them from 1 to 
m and thus we can construct the optimal prediction equations without any explicit 
reference to the set of resolved modes. 

The equations for the evolution of the set of m resolved variables in the short- 
memory approximation are 

d /"* 

T-e^^uoj = e^^PLuoj + e'^^QLuoj + / e^^'"^^ PLQe'^QLuojds, (4.2) 
ot Jq 

where j = 1, . . . , m, L = J2'i=i ^^(^0)9^^ and uoj = Mj(0). Since PLuqj = 9^j(uo), 
Luoj = Rj (uq) we have 

QLuoj - i?j(uo) - 9^,(uo). (4.3) 

4.1. Markovian term. We proceed with the calculation of the first term Piuoj = 
Dlj{uo) in (|4.2|l . For the case of a Gaussian density the conditional expectations can 
be computed exphcitly fTT|. 

Let G be the m x n matrix that has the property Guq = Uo- For the diagonal 
Gaussian density we find 

m 

E[uQi\uQ] = ^ QtkUok + Ci (4.4) 
fc=i 

for i = 1, . . . ,n where qik are the entries of the n x m matrix Q and Ci are the entries 
of the n-vector c given respectively by 

Q ^ {CG*){GCG*)-^ (4.5) 
c = fi- {GG*){GGG*)-\Gn), (4.6) 

where G is the covariance matrix of the Gaussian density with entries Cji — E[[uQj — 
fJ-j)iuQi — fJ-i)*] — Aji and /i is the vector of expectation values. Note that in the case 
of a diagonal Gaussian density, the entries qik for i — m -\- 1, . . . ,n and k = 1, . . . , m 
are zero. Hence, the conditional expectations of the unresolved modes conditioned on 
the resolved ones are equal to their unconditional expectations. This is as expected, 
because if the Gaussian is diagonal, the modes are independent from each other. 
The conditional covariance matrix has entries 

Cov[uQ„Uoj]i,g = E[uoiUQj\uo] - E[uq,\uq]E[u'^j\uo] = [C - QGC]^, (4.7) 

for i,j — 1, . . . ,n. For the case of a diagonal Gaussian density, the entries of the 
conditional covariance matrix corresponding to interactions among resolved modes 
and their interaction with the unresolved modes are zero, for the same reasons ex- 
plained above. Also, the entries of the covariance matrix for the interactions among 
unresolved modes are equal to the entries of the unconditional covariance matrix. 

Wick's theorem holds for conditional expectations too JT]. Since we are dealing 
with complex variables, higher moments than the second consist of products of cen- 
tered variables woi — i?[uoi|uo] that have to be defined in a way that for even moments 
the resulting product is a positive number. For this reason we define the variable 
Wi = uoi or Wi = depending on the product and find 

^r-TT , . 1 I P odd, , , „, 

o OD [Wi J , J uo ■ • • L' ov [Wi^ , w^^ J uo even. 
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where the summation is over ah the possible pairings of the P indices and Vp = 
Wp - E[wp\uo]. 

The nonhnear sum in 14.1|l can be decomposed into 3 parts as 

^ M<iW<2 + X! ^<i^>2 + X! ">i">2' (4-9) 

<1,<2 <1,>2 >1,>2 

where X]<i <2 indicates that we sum over ah pairs of indices where both indices 
belong in the resolved range i — l,...,m, ^ indicates summation over all 

pairs of indices where only one index is in the resolved range and ^ indicates 
summation over the pairs where both indices are outside the resolved range. 

Using the decomposition above can facilitate the computation of the conditional 
expectations. We have 

^[^j(uo)|uo] = wo<iUo<2-y wo<i-B[uo>2|uo] (4.10) 

<1,<2 <1,>2 

--^ Y E[uo>,uoy^\uo] + if - i^j'^)uoj 

>1,>2 

For the quantity £^[wo>2|uo] we have 

m 

i;[uo>2|uo] = ^g>2,i'«o/ + c>2, (4.11) 
1=1 

while for the quantity _E[uo>iWo>2 |uo] we have 

S[uo>iMo>2|uo] = £^[uo>iUo_>Juo] 

= [C-QGC]o>„o->2 +SK>Juo]SK->2|uo]. (4.12) 

The expressions H4.1()ll . (|4.11(l and (|4.12|l determine the Markovian term. If we use a 
diagonal Gaussian density, some of the terms in these expressions become zero. How- 
ever, we kept the most general form for reasons of completeness of the presentation. 

4.2. Noise term. The second term in (|4.2|l is the noise term QLuoj, and we 
proceed with its calculation. From l|4.3ll . (|4.9|l and (|4.1U|) we have 

QLuoj = Y "o<i('"o>2 - -E['uo>2|uo]) (4.13) 

<1,>2 
>1,>2 

= Aj(uo) + Bj(uo), 

where 

^i(uo) = ~y X! ^o<i(mo>2 - E[uo>^\uo]) 

<1,>2 

and 

Bj{no) = Y ('"o>iUo>2 - ■E^[uo>iMo>2|uo])- 

>1:>2 
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Defining similarly Aj{u(uQ, s)) and Bj{u{uQ, s)) such that 

e'^QLuQj = Aj{u{uo, s)) + Sj(u(uo, s)) 
we have in the short-memory approximation 

^e*^zio,=e*^5H,(uo) (4.14) 

+ ^j(u(uo,i)) + / e(*-'')^PLQylj(u(uo,s))ds 
Jo 

+ Bj(u(uo,t)) + / e^'-'^^PLQBjiu{uo,s))ds 
Jo 

From (|4.1^^|l and (|4.14|l we see that each one of the sums X]<i >, ^^'^ S>i >, make a 
contribution to the memory and the noise. We should mention here that the emergence 
and significance of multiplicative noise terms like the one coming from the sum X]<i >2 
have been also discussed in the context of the stochastic modelling reduction strategy 
of Majda et al. 25, .26 • 

For the choices that we will make for the set of resolved variables, we can safely 
remove the contribution to the memory and the noise that comes from X]>i >2 ^ 
small due to the stiffness of the system; the stiffness of the KS system is due to 
the large ratio of growth rates between the small wavenumber modes and the large 
wavenumber modes that are present in the solution. In all the numerical experiments, 
the set of resolved variables includes mostly small wavenumber modes, thus the bulk of 
the unresolved modes are large wavenumber modes. These large wavenumber modes 
have small magnitude and thus the contribution of their interaction to the memory 
and noise through the term X]>i >2 ^^^^ small and thus can be neglected. However, 
recall that we retain the contribution of ^ to the Markovian (first) term £Hj (uq) . 

The expression for ylj(u(uo,t)) involves quantities that are unknown, namely the 
modes u>2(t). We will approximate these modes as stochastic processes with known 
mean and autocorrelation by using the moving average method 1271 Kffll |2] . Due 
to the fact that we are using a density that is non- invariant, the evolution of all the 
Fourier modes, and of the unresolved in particular, does not constitute a stationary 
stochastic process. However, for our numerical experiments we will treat the unre- 
solved modes' evolution as a stationary stochastic process and use their statistics with 
respect to the initial density for the moving average method simulations. We defer 
the application of more sophisticated sample path generating methods suitable for 
nonstationary processes for future work. 

Let u(t, Lo) be a wide sense stationary stochastic process. If its covariance 

R{ti,t2) = E[{u{t2, Lo) - m{t2)){u{ti,u;) - m{ti))*] 

is written as R{ti,t2) = J e*'^''*^~*i-'(iF(fc) for some non-decreasing function F{k) 
(through Khinchin's theorem) and also F{k) is such that dF{k) = (f)(k)dk, then 

u{t,uj) = J h*{s-t)p{ds) (4.15) 

where the function h{t) is the inverse Fourier transform of h{k) — ^ 4>{k) and also, 
i?[|p((is)p] = ds. Note that the random measure p constructed as increments of Brow- 
nian motion at instants ds apart has this property. Thus, any wide-sense stationary 
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stochastic process with dF{k) — (j}{k)dk can be approximated as a sum of trans- 
lates (in time) of a fixed function, each translate multipUed by independent Gaussian 
random variables. This is the "moving average" representation. 

Having developed the formalism for the moving average representation, we now 
derive formulas for its numerical implementation. A possible approach would be to 
discretize the expression (14.15(1 in intervals of length Ai and use a midpoint rule to 
find 

n+j 

n{t,)= J2 h*iU~t,)p, (4.16) 

i— — n-}-j 

with = At and tj — jAt. (j4.16() can be rewritten (by the change of variables 

i' — i — j and dropping the primes) as 

n 

<h) = E h*{U)p^+J (4.17) 

i— — n 

and in effect the velocity at tj is the dot product of a vector of values of h and 
a random vector. Recalling the remark above about the increments of Brownian 
motion, we see that the process u can be represented as a sum of suitably weighted 
Gaussian independent random variables with mean zero and variance At. In the 
numerical experiments, we split each unresolved mode in its real and imaginary parts 
which are approximated independently. 

4.3. Memory term. After computing the Markovian and noise terms, we pro- 
ceed with the calculation of the memory term 

I e^*-''^^PLQAj{u{uQ,s))ds. (4.18) 

Note that as in the case of the noise we only use the part of the memory that comes 
from the y terms in the convolution sums. We will use two different projections 
P, namely the linear and the finite-rank one. The conditional expectation projection 
will not be used because lack of explicit expressions make it computationally very 
expensive. 

4.3.1. Linear projection. For a function .g(uo) the linear projection is 

rn 

(F9)(uo) = b-j\9,uo,)uo, (4.19) 

where 6~-^ are the entries of a matrix whose inverse has entries bij = {uQi,uoj)* and 
the inner product is defined through the density f{x) (Gaussian in our case) as 

/ + 00 
g{x)h* [x) f {x)dxdx* . 
-oo 

Using (|4.19|) we have 

e(*"")^PLQAj (u(uo, s))ds = (4.20) 



/ e(*-"'^ V h-^{LQAj{vL{uQ,s)),uoi)uQkds. 

•^0 r.k^l 
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What remains to be done is to compute the quantity LQAj{u{uo, s)). Some algebra 
and use of the relation 

"9 " OA 

V ii, (uo ) -— A, (u(uo , s) ) = ^ ii, (u(uo , s) ) ( — ^ ) (u(uo , s) ) , 

gives 

LQAj{u{uo,s))= (4.21) 

" dA 

^i?,(u(uo,s))(^)(u(uo,s))- b-^\Aj{u{xio,s)),uoi)Rk{uo). 

1=1 i,k=l 

Then the quantities {LQAj{u{ui), ,s)), ugi) can be computed by sampling the density 
via Monte Carlo, evolving the full system and averaging. 

4.3.2. Finite-rank projection. The finite-rank projection of a function g{uo) 
on a finite number of terms of an orthonormal set of functions /ii(uo), /i2(uo), ... is 

I 

{Pg){ixo) =Y,{9,hi)hi, (4.22) 

i=l 

where the inner product is defined as before and {hi,hj) = Sij. Due to the fact 
that we are using a complex Gaussian density we can define an orthonormal set of 
functions by suitably modifying one-dimensional Hcrmitc polynomials for each of the 
m resolved variables. Let Uq = {uqi, ■ ■ ■ ,uom) be the vector of initial values of the 
resolved variables, where uoj = zji + izj2- Define the 2m-dimensional real vectors 

Z = {zu,Zi2, ■ ■ ■ , Zml, Zm2) 

and 

1^ = (/^llj Ml2i • • • , Mml, Mm2)- 

We need to split each mode in its real and imaginary parts because of the definition 
of the orthonormal set below. 

We define the multi-index k= (ki, . . . , Km) and the set of functions 

h^{z) = f[[H,Jl{z,,-fij^))+iH,^J^{zj2-fij2))], (4.23) 

■j^ y j y j 



where 

H^^ix) = ■l={l + 2(3^f^H^^{{l + 2f3^f^x)e-'-^^\ (4.24) 
The functions iJ^, are Hermite polynomials (with weight exp(— ix^)) satisfying 



Ho{x) = l, Hi{x)=x, Hk{x)^^xHk-i{x)^^j^-^Hk-2[x) (4.25) 

and P^. > — i. The derivatives of the functions H^jix) can be computed by the 
recursive relation 



d_ 
dx 



H^^{x) = {l + 2p,^)^H,._,{x) - p^^xH^^ix). (4.26) 



OPTIMAL PREDICTION FOR THE KURAMOTO- SIVASHINSKY EQUATION 



17 



To ensure the orthonormality of the above set of functions, we have to impose a 
constraint on the values of the entries of the muhi-index k. The constraint is, that 
the entries corresponding to a positive wavenumber and its corresponding negative 
wavenumber cannot be simuhaneously different from zero. We, also, have to enforce 
the constraint /J^j = when H^^. is of order zero. These constraints arise from the 
fact that the solution of Kuramoto-Sivashinsky equation is real thus the fc-th negative 
Fourier mode is the complex conjugate of the fc-th positive mode. 
The memory term becomes 

f e(*-")^PLQAj (u(uo, s))ds ^ 
Jo 

^(iQ^,(u(uo, 5)), /i''(uo))/^"(uo)rfs, 

Kei 



where 

LQAj{u{uo,s)) = 

V i?z(u(uo, s))(^)(u(uo, s)) 

2 — 1 

- f2 i?.(uo) ^(A,(u(uo, .)), /i-(uo))^^^ 

and / denotes the set of m-tuples of indices used in the finite-rank projection. 

Thus, we see that starting from the short-memory approximation we have trans- 
formed the original system of ODEs into a system of random ordinary integrodiffer- 
ential equations. Such an approach was adopted in pp and termed stochastic optimal 
prediction. In the form of the projection coefficients allowed the reduction of 
the integrodifferential equations to differential equations (this is the delta-function 
approximation dicussed earlier). 

5. Numerical simulations. We use the short-memory approximation equa- 
tions for the KS system to compute the evolution of the conditional expectations of 
a set of resolved Fourier modes conditioned on their initial values. We conducted 
numerical experiments for two sets of resolved variables. The first set includes all the 
modes that are linearly unstable (for the value of the viscosity used). The second 
set includes all but one linearly unstable modes. The error exhibited by the short- 
memory approximation is strikingly different for these two sets of resolved variables 
and we offer an explanation for this difference. 

5.1. Resolution of all unstable modes. We present the results of the short- 
memory approximation for the first set of resolved variables that includes all the 
unstable modes. 

5.1.1. Linear projection. In order to construct the short-memory approxima- 
tion equations we have to compute the first (Markovian), second (noise) and third 
(memory) terms in the RHS of the short-memory approximation equation. The 
Markovian term can be computed explicitly, while the noise and memory terms rely 
on expressions that have to be computed through simulations of the full system. 

For the value of viscosity v = 0.085 used in our experiments, a truncation retaining 
the first = 24 modes is enough to fully resolve the system. Since two of the modes 
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Fig. 5.1. Autocorrelation for the unresolved mode with wavenumber 6 for different initial times, 
a) Real part, b) Imaginary part. 



are set to zero, the effective number of modes is n = iV — 2 = 22. The n-dimensional 
system is called the full system. We assume that initially we know the values of only m 
of the n modes, while the values of the rest n — m modes are drawn from the diagonal 
Gaussian density H3.2I) . The m resolved modes constitute the reduced system. We 
examine the case where we know initially only y = 12 variables. However, for the 
same reasons as in the case of the full system, the zeroth and -6th mode are set to 
zero, and the effective number of modes for the reduced system is 12 — 2 = 10. We 
set TO = 10 in all subsequent calculations. 

For the first set of resolved modes, the reduced system includes all the modes 
that are linearly unstable. For our case there are three positive and three neg- 
ative wavenumber modes that are linearly unstable, namely the modes for k = 
— 3, — 2, — 1, 1, 2, 3. The rest 10 — 6 = 4 modes in the resolved set are linearly sta- 
ble modes, namely the modes with k = —5, —4, 4, 5. 

We proceed with the presentation of the calculations for the quantities needed for 
the evaluation of the noise term Aj{u{uo,t)). The noise term involves the unknown 
quantities M>2(i)- Because we are using a density that is not invariant, the quantities 
u>2{t) are nonstationary stochastic processes. Fig. (|5.1() shows the autocorrelation for 
the unresolved mode with wavenumber 6 



for different initial times ti. As explained before, we approximate the quantities Uy2{t) 
as stationary stochastic processes with mean and autocorrelation determined with 
respect to the diagonal Gaussian density. We use the moving average method to 
sample these stochastic processes. Each of the unknown Fourier modes is split into 
its real and imaginary parts which are approximated independently. Fig. H5.2|l shows 
the autocorrelation of the real and imaginary parts for the unresolved mode with 
wavenumber 6 as computed by the full system and the moving average method. The 
moving average method estimates for the autocorrelations of the unresolved modes 
were produced by averaging over 10000 sample paths. 

After the properties of the noise term, we have to calculate the properties of the 
memory term. The quantities needed for the memory term evaluation were computed 
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Fig. 5.2. Comparison of the autocorrelation for the unresolved mode with wavenumber 6 as 
computed from the full system and the moving average method, a) Real part; b) Imaginary part. 
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Fig. 5.3. Projection coefficient of the memory term for the equation for the resolved mode 1 
on the resolved mode 4 for the first set of variables, a) Real part, b) Imaginary part. 



averaging over 10000 samples. For the case of the hnear projection, we have to 
compute the term (recall (|4.20|l ) 



Jo 



uoi)uQkds, 



i,k=l 



SO we need to compute the projection coefficients 6j^^(LQv4j(u(uo, s)), uoi). 

Fig. H5.3|l shows the evolution of the projection coefficient of the memory term for 
the equation for the resolved mode with wavenumber 1 on the resolved mode 4 
'Y^™' b~^{LQAi{\i{\iQ,s)),Uf)i) (from now on we omit the word wavenumber for aes- 
thetic reasons). 

Inspection of the behavior of all the projection coefficients needed in the memory 
term determination reveals that the coefficients decay fast in time. The fast decay of 
the quantities {LQe'^^QLuj, Uk), i.e. of the projection coefficients in the short-memory 
approximation, can only guarantee the fast decay of the quantities {LQe^'^^QLuj, Uk) 
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of the exact Mori-Zwanzig equation for short times. For large times, the quantities 
{LQe'^'^^QLuj, w^) can behave very differently from {LQe'^^QLuj, Uf^). Thus, we have 
to truncate the interval of integration for the integral term from [0, t] to [t — to, t] (for 
t > to). The time to is a short time over which the error of using {LQe^'^QLuj, Uk) in- 
stead of {LQe'''^'"QLuj,Uk) is not large. Of course, we do not know the value of to be- 
forehand, since we do not know the quantities {LQe^'^^QLuj, Uk). However, motivated 
by the numerically determined form of the projection coefficients [LQe^^QLuj^Uk), 
we truncate the interval of integration for the memory term from [0, t] to [t — 
(for t > 1), since most projection coefficients decay significantly in a time interval of 
length 1. 

The determination of the properties of the noise and the memory allows us to solve 
the short-memory approximation equations for the resolved modes. Although the full 
system is stiff, since it includes many stable modes with very large decay rates, the 
stable modes contained in the reduced system have small decay rates. This allows us 
to use an explicit solver for the reduced system. For our numerical experiments we use 
the Runge-Kutta 4th order method to solve the random integrodifferential equations 
of the short-memory approximation. For the calculation of the integral memory term 
we used two different methods, namely the trapezoidal rule and Simpson's rule. Most 
experiments were done twice, once with each method. 

The quantities used in the noise term have very fast decaying autocorrelations, 
which result in a rough (in time) noise. This, in turn, can create problems with the 
numerical stability of the explicit solver. However, the noise has small magnitude and 
does not appear to create problems in the actual implementation. 

The system of random integrodifferential equations is solved over and over for 
different realizations of the noise. The initial condition used for the resolved modes 
with wavenumbers — 3, — 2, — 1, 1, 2, 3 is ui = u_i = U2 = u-2 = ^3 = u_3 = 1, 
while the rest of the resolved modes are set initially equal to zero. The conditional 
expectations of the resolved modes conditioned on their initial values are computed 
by averaging over the realizations of the noise. 

The truth is computed as follows. We sample the density over and over keeping 
the resolved modes fixed to the values just described. For each sample, we evolve (13.111 
for /c = — Finally, we average over the samples and obtain the conditional 
expectations of the resolved modes. 

The Galerkin approximation consists of setting the unresolved modes equal to 
zero for all times, i.e. solving H3.1|l for k = —5, . . . , 5. For the Galerkin approximation 
there is no need to solve the reduced system repeatedly, because for each realization 
the unresolved modes are set equal to zero. 

Figs. H5.4|l . (|5.5|) . H5.5|l show the real and imaginary parts of the short-memory 
approximation estimates of the conditional expectations for the resolved modes 1,2,3 
as compared to the truth and the Galerkin approximation. The true conditional 
expectations were computed by averaging over 1000 samples. The short-memory 
estimates were computed by averaging over 1000 realizations of the noise. The results 
for the truth and the optimal prediction estimates are converged, even with a 1000 
samples, because, the unresolved modes have small magnitudes. However, there is no 
"slaving" ^31 of the unresolved modes to the resolved ones. The randomness in the 
initial conditions of the unresolved modes does affect the conditional expectations of 
the resolved ones. However, in the time interval that we examine, the results of the 
randomness of the initial conditions have not yet fully manifested themselves, and 
thus 1000 samples (or 1000 realizations of the noise) are enough for convergence of 
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Fig. 5.4. Conditional expectation evolution for the resolved mode 1 for the first set of resolved 
variables, a) Real part, b) Imaginary part. 
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Fig. 5.5. Conditional expectation evolution for the resolved mode 2 for the first set of resolved 
variables, a) Real part, b) Imaginary part. 

the results. For longer times, more samples (or realizations of the noise) are needed 
for convergence of the truth and the optimal prediction estimates. 

As can be seen, there is good agreement between the short-memory approxima- 
tion estimates for the conditional expectations of the resolved modes and the true 
conditional expectations of these modes. Also, the short-memory approximation re- 
sults constitute a considerable improvement over the results for the Galerkin ap- 
proximation. The good agreement of the short-memory estimates for the conditional 
expectations of the resolved modes, indicates that the fast decay of the projection 
coefficients {LQe^'^^QLuj,Uk) is not restricted to short times only. On the contrary, 
they must continue to decay fast for larger times and this allows the short-memory 
approximation to perform well. 

Motivated by the fact that all the projection coefficients [LQe^'^QLuj^Uk) used 
in the short-memory approximation decay fast, we can inquire about the validity of an 
even more drastic approximation which replaces the projection coefficients by delta- 
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Fig. 5.6. Conditional expectation evolution for the resolved mode 3 for the first set of resolved 
variables, a) Real part, b) Imaginary part. 



functions multiplied by the integrals of the quantities {LQe^^QLuj,Uk). Of course, 
as before, these integrals must be computed over a truncated interval. The advan- 
tage of such an approximation is that the reduced system is a system of random 
ordinary differential equations and not integrodifferential equations. This results in 
a significant reduction of the computational time needed to calculate the estimates 
of the conditional expectations of the resolved modes. Fig. (|5.7|l show the estimate 
of the conditional expectation for the resolved mode 1 using the delta-function ap- 
proximation, compared to the results of the short-memory approximation and the 
truth. The delta-function approximation can predict accurately the evolution of the 
conditional expectation for shorter times than the short-memory approximation. The 
inadequacy of the delta-function approximation to predict the evolution of the condi- 
tional expectations for longer times can be explained by a more careful inspection of 
the projection coefficients {LQe^'^QLuj^Uk)- Although these coefficients decay fast, 
several of them do not retain the same sign during this decay (see e.g. Fig. (|5.8|l ). 
In fact, they oscillate rapidly with non-negligible amplitudes. Integration of these 
oscillations results in cancellations that lead to loss of important information about 
the short-time memory. As a result, the delta-function approximation loses accuracy. 
However, the accuracy exhibited by the delta-function approximation is impressive, 
if we consider that it results in a system of differential (and not integrodifferential) 
differential equations whose numerical integration is very fast. For our numerical ex- 
periments, the integration of the delta- function approximation equations is 15 times 
faster than the integration of the short-memory approximation equations. In fact, 
the time of integration, for one realization of the noise, of the delta- function approx- 
imation equations, is comparable to the time of integration of the equations for the 
simple Gaierkin approximation. 

5.1.2. Finite-rank projection. We conclude our presentation of results for the 
first set of variables with the estimates of the conditional expectations obtained when 
we use the finite-rank projection in the memory term calculations. 

Fig. (|5.9|l shows the real and imaginary parts of the finite-rank projection short- 
memory approximation estimates of the conditional expectations for the resolved 
mode 1, as compared to the truth and the linear projection short-memory approxi- 
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Fig. 5.7. Conditional expectation evolution for the resolved mode 1 for the first set of resolved 
variables. Delta-function vs short-memory, a) Real part, b) Imaginary part. 
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Fig. 5.8. Imaginary part of the projection coefficient of the memory term for the equation for 
the resolved mode 5 on the resolved mode 1 for the first set of resolved variables. 



mation estimates. The finite-rank projection short-memory approximation estimates 
were computed by averaging over 1000 reaUzations of the noise. For our numerical 
experiments we use the value = for all . , irrespective of their order (see 
14.2314.25)) . Note that when (3^. — 0, the first order polynomials in the set are the 
functions used in the linear projection (with a different numerical factor coming from 
the orthonormality of the set functions). The finite-rank projection estimates are 
computed using polynomials of order up to 2 for the two most unstable modes. All 
the constraints mentioned above result in a set of 30 orthonormal functions (we also 
omit the function that consists of polynomials of order zero in each mode, which is a 
constant). The interval of integration of the memory term was truncated from [0, i] 
to \t — l,t]. Different truncations did not alter the results much. 

The finite-rank projection estimates of the conditional expectations are worse 
than the linear projection ones. This result, although surprising at first, can be ex- 
plained by examining the assumption under which the short-memory approximation 
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Fig. 5.10. Examples of slowly- decaying projection coefficients of the memory term on polyno- 
mials of order higher than 1. Finite-rank projection for the first set of resolved variables. 

is expected to be valid for long times. The short-memory approximation is valid if 

the quantities {LQe'^^'^QLuj,hk) decay fast and stay small for long times. If they 
do not, then the short- memory approximation is valid for short times only. Also, 
the behavior of the quantities {LQe'^^QLuj,hk) can only be used to infer the be- 
havior of {LQe'^^^QLuj^ hk) for short times. Inspection of the projection coefficients 
{LQe'^^QLuj, hk) for the case of the finite-rank projection reveals that several of these 
coefficients, do not decay fast. This means that, at least for short times, the quanti- 
ties [LQe^'^^QLuj, hk) do not decay fast. Since the conditional expectation estimates 
based on the finite-rank projection arc not good for longer times, we conclude that 
the quantities {LQe^'^'^^QLuj, hk) do not start decaying fast after a short time. This 
is not contradictory with the notion that the finitc^-rank projection is a more accu- 
rate projection than the linear one. Indeed, the finite rank projection is better than 
the linear one for the memory term of the exact Mori-Zwanzig equation and for the 
memory term of the short-memory approximation. However, being a more accurate 
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projection, does not necessarily guarantee that the projection coefficients on polyno- 
mials of order higher than 1 (recall (3^,^ = 0) have to decay fast and stay small for 
long times. This is exactly what happens in our case. Several projection coefhcients 
[LQe^'^^QLuj^hk) on polynomials of order higher than 1 do not decay fast and do 
not stay small for long times, thus violating the short-memory approximation assump- 
tion. In this way, the short-memory approximation for the linear projection case that 
does not include those slowly-decaying coefficients, is better than the short-memory 
approximation for the finite-rank projection that docs include those slowly-decaying 
projection coefficients. Fig. (|5. 101) shows some examples of slowly decaying projection 
coefficients on polynomials of order larger than 1. 

5.2. Resolution of all but one unstable modes. For the second set of re- 
solved modes, we include in the resolved modes all the linearly unstable modes except 
for one. We choose to leave the modes fc = — 1, 1 as unresolved. There is no particular 
reason for this choice. Any other unstable mode can be left as unresolved, since it is 
the instability, and not the growth rate, of an unresolved mode that determines the 
accuracy of the short-memory approximation. In fact, for our value of the viscosity, 
mode 1 has the smallest (linear) growth rate among the unstable modes. With this 
choice, the set of resolved modes contains the unstable modes k = —3,-2,2,3 and 
the stable modes k = -6,-5,-4,4,5,6. As in the case of the first set of resolved 
variables, the full system has n = 22 modes and the reduced system m = 10 modes. 
For the second set of resolved variables, we only conducted numerical experiments 
where the linear projection was used for the memory term. 

Fig. H5.11|l shows the typical behavior of the projection coefficients. As is evident, 
the projection coefficients (LQe^^QLujjUk) of the short-memory approximation do 
not decay fast. There are still some coefficients that decay fast, but the majority of the 
projection coefficients do not and those determine the accuracy of the conditional ex- 
pectation estimate. We only know (recall H2.7|l ) that the coefficients {LQe'^^QLuj, Uk) 
approximate well the quantities {LQe'^^^QLuj, Uk) for short times. Even though the 
coefficients {LQe'^^QLuj,Uk) of the short-memory approximation do not decay fast, 
the coefficients {LQe^^^QLuj, ut) can very well start decaying fast after a short time. 
However, we don't know that before implementing the short-time approximation. By 
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Fig. 5.12. Conditional expectation evolution for the resolved mode 2 for the second set of 
resolved variables, a) Real part, b) Imaginary part. 

the same reasoning as in the case of resolution of all unstable modes, we have to 
truncate the interval of integration for the memory term, from [0,t] to [t — t^, t] (for 
t > to). Again, to is a short time over which the error of using [LQe^^QLuj, Uk) in- 
stead of {LQe'^^^QLuj^Uk) is not large. Of course, we do not know the value of ig. In 
the case of resolution of all unstable modes, we guessed that the interval to should be 
close to the time needed for the projection coefficients to decay significantly. But, in 
the present case, where the projection coefficients do not decay fast, we cannot make 
any guess. We conducted numerical experiments with different truncated intervals 
of integration for the memory term and the results do not change much. We present 
results for the case where the interval of integration of the memory term is truncated 
from [0, t] to [t ~ 1, t] (for t>l). 

The system of random integrodifferential equations is solved repeatedly for differ- 
ent realizations of the noise. The initial condition for the resolved modes —3, —2, 2, 3 
is U2 = W-2 = "3 = "fi-s = 1, while the rest of the resolved modes are set initially 
equal to zero. 

Figs. H5.12(l , H5.13|l , H5.14|l show the real and imaginary parts of the short- memory 
estimates of the conditional expectations for the resolved modes 2,3,4 as compared to 
the truth and the Galerkin approximation. The true conditional expectations were 
computed by averaging over 10000 samples. The short-memory estimates were com- 
puted by averaging over 10000 realizations of the noise. As can be seen, there is good 
agreement between the short-memory approximation estimates for the conditional ex- 
pectations of the resolved modes and the true conditional expectations of these modes 
only for very short times. For longer times the errors become large. 

We show next that the results for the short-memory approximation depend mostly 
on the accuracy with which we approximate the memory term, and not so much on 
the accuracy of the approximation of the noise term. It is instructive to integrate the 
optimal prediction equations in the short-memory approximation with the memory 
term set to zero. Fig. H5.15|l shows the conditional expectation estimate for the resolved 
mode 2 as predicted by the short-memory approximation when the memory term is 
set to zero. We compare it to the truth and the short-memory approximation when 
the memory term is not set to zero. As is evident, the estimate is good for short 
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Fig. 5.13. Conditional expectation evolution for the resolved mode 3 for the second set of 
resolved variables, a) Real part, b) Imaginary part. 
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Fig. 5.14. Conditional expectation evolution for the resolved mode 4 for the second set of 
resolved variables, a) Real part, b) Imaginary part. 



times, while for longer times, the estimate converges to zero. Any deviations of the 
conditional expectations from zero, i.e. any improvement of the estimate, must be 
produced by the memory term. The deviations from zero, for long times, of the true 
conditional expectations, translate into memory effects for the set of resolved modes, 
and if these memory effects are not well represented in the model for the resolved 
modes, the prediction of the deviations is not accurate. 

Since the short-memory approximation estimates are not good for long times, 
we conclude that the quantities {LQe'''^^QLuj,Uk) do not start decaying fast after a 
short time. If they did, the short-memory approximation would give good results for 
longer times. The inclusion of an unstable mode in the unresolved modes, results in 
the appearance of long-time memory effects, but the precise mechanism of formation 
of these long-time memory effects has yet to be determined. 

For the second set of variables, no attempt was made to compute the conditional 
expectation estimates using the finite-rank projection or the delta-function approxi- 
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Fig. 5.15. Comparison of the conditional expectation evolution estimates for- the Tesolved mode 
2, as produced by the short-memory approximation and the short-memory approximation with the 
memory term set to zero, a) Real part, b) Imaginary part. 

mation. For the case of the finite-rank projection, there is no chance of improvement, 
since the hnear projection coefficients that are part of the finite-rank projection, are 
already slowly-decaying. For the case of the delta-function approximation, there is no 
incentive to try it, since the projection coefBcients do not decay fast. 

6. Conclusions. We have applied the optimal prediction formalism to the so- 
lutions of the Kuramoto-Sivashinsky equation in a Fourier-Galerkin truncation when 
some initial data are missing. The conditional expectations of the resolved modes 
conditioned on their initial values were estimated through simulation of the optimal 
prediction equations and compared to the true conditional expectations computed 
from simulations of the full system. The results of the comparison depend on which 
Fourier modes are contained in the set of resolved variables and on the type of pro- 
jection used in the memory term. For the case when the resolved variables include all 
the unstable modes, we used two different kinds of projection for the memory term, 
namely the linear projection and the finite-rank one. 

For the linear projection, the agreement between the optimal prediction estimates 
of the conditional expectations and the true conditional expectations is good for rel- 
atively long times. Also, the estimates show a considerable improvement over the 
Galerkin approximation, where we resolve a reduced set of variables and set all the 
unresolved variables equal to zero. For the case of the linear projection we, also, 
tried the more drastic delta-function approximation where the correlations appearing 
in the memory term integrand are replaced by a delta-function multiplied by the in- 
tegral. In this case, the resulting optimal prediction equations are differential (not 
integrodifferential) equations. The results are not as good as in the more sophisticated 
short-memory approximation. However, the accuracy of the conditional expectation 
estimates based on the delta-function approximation is impressive, if we consider the 
much greater numerical efficiency of the resulting system of equations for the resolved 
modes. 

For the finite-rank projection, the agreement is good only for short times. The 
growth of the error for larger times is due to the fact that the short-memory approxi- 
mation assumption is violated by the appearance of long-time memory effects. These 
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manifest themselves as slowly-decaying projection coefficients in the memory term 
integral. 

For the case where one unstable mode is left unresolved, we used only the linear 
projection. The error of the estimates of the conditional expectations becomes large 
after a short time. The growth of the error is due to the fact that the short-memory 
approximation assumption is violated by the appearance of long-time memory effects. 
The short-memory approximation estimates of the conditional expectations are still 
an improvement over the Galerkin approximation estimates. For the case where one 
unstable mode is left unresolved, we did not attempt to use the finite-rank projection, 
since the linear projection that is part of the finite-rank projection, already violates 
the short-memory approximation. 

In the case of the finite-rank projection for the first set of resolved modes, the 
long-time memory effects are a result of projecting on polynomials of the resolved 
modes of order higher than 1. For the second set of resolved modes, where we used 
the linear projection for the memory term, the long-time memory effects are a result 
of leaving an unstable mode as unresolved. Although the long-time memory effects 
for these two cases have different causes, the result is the same, namely the violation 
of the short-memory approximation assumption. This leads to a large increase of the 
error in the conditional expectation estimates after a short time. The inadequacy 
of the short-memory approximation to produce accurate estimates of the conditional 
expectations for long times, suggests that for cases with slowly-decaying projection 
coefficients, the calculation of the quantities [LQe^'^^QLuj.Uk) cannot be avoided 
(see inj for methods of computing these quantities) . 

The determination of the reasons for the appearance of long-time memory effects 
is important since it can provide insight on the way that the partial information 
known initially is lost when we perform underresolved computations. It can also help 
in determining which variables (or combinations of variables) of the full system should 
be resolved, if we hope to obtain a reduced model of the system that is accurate for 
long times |71[T7]. 
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